Holding Teacher Preparation Accountable:

A Review of Claims and Evidence 

Marilyn Cochran-Smith, Rebecca Stern, Juan Gabriel Sanchez, Andrew Miller, 
Elizabeth Stringer Keefe, M. Beatriz Fernandez, Wen-Chia Chang, Molly 
Cummings Carney, Stephani Burton, & Megina Baker, Boston College


Executive Summary 

Teacher preparation has emerged as an acutely politicized and publicized issue in U.S. 
education policy and practice, and there have been fierce debates about whether, how, by 
whom, and for what purposes teachers should be prepared. This brief takes up four major 
national initiatives intended to improve teacher quality by “holding teacher education 
accountable” for its arrangements and/or its outcomes: (1) the U.S. Department of
Education’s state and institutional reporting requirements in the Higher Education Act 
(HEA); (2) the standards and procedures of the Council for the Accreditation of Educator 
Preparation (CAEP); (3) the National Council on Teacher Quality’s (NCTQ) Teacher Prep 
Review; and (4) the edTPA uniform teacher performance assessment developed at Stanford 
University’s Center for Assessment, Learning, and Equity (SCALE) with aspects of data 
storage and management outsourced to Pearson, Inc. 

These four initiatives reflect different accountability mechanisms and theories of change, 
and they are governed by different institutions and agencies, including governmental 
offices, professional associations, and private advocacy organizations. Despite differences, 
each assumes that the key to teacher education reform is accountability in the form of public 
assessment, rating, and ranking of states, institutions, programs, and/or teacher candidates. 
This brief addresses two questions for each initiative: What claims do proponents of the 
initiative make about how it will improve teacher preparation and thus help solve the teacher 
quality problem in the U.S.? What evidence supports these claims? The first question gets 
at the theory of change behind the initiative and its proponents’ assumptions about how 
particular mechanisms actually operate to create change. The second involves the validity 
of the initiative as a policy instrument—that is, whether or not there is evidence that the 
initiative actually meets (or has the capacity to meet) its stated aims. 

This review has two major conclusions. The first is that across three of the four initiatives 
(HEA regulations, CAEP accreditation, and NCTQ’s reviews), there is thin evidence to 
support the claims proponents make about how the assumed policy mechanisms will actually 
operate to improve programs. The advocates of these initiatives assume a direct relationship 
between the implementation of public summative evaluations and the improvement of 
teacher preparation program quality. However, summative evaluations intended to influence 
policy decisions generally do not provide information useful for program improvement. The 
irony here is that while these policies call for teacher education programs and institutions 
to make decisions based on evidence, the policies themselves are not evidence-based. Thus 
there is good reason to question their validity as policy instruments that will have a positive 
impact on teacher education quality. In contrast, the edTPA has some evidentiary support
as a policy initiative, but concerns within the collegiate teacher preparation community plus 
state implementation problems suggest that widespread implementation and professional 
acceptance may be challenging to accomplish. 

Our second conclusion is that although all of the accountability initiatives we reviewed are 
intended to diminish educational inequality, underlying most of them is a notion of what 
we call thin equity. When policies work from a thin equity perspective, the assumption is 
that school factors, particularly teachers, are the major source of educational inequality and 
that access to good teachers is the solution to the equity problem. This viewpoint ignores 
the fact that teachers account for a relatively limited portion of the overall variance in 
student achievement, and it does not acknowledge that inequality is rooted in and sustained 
by much larger, longstanding, and systemic societal inequalities. In contrast, a strong 
equity perspective acknowledges the multiple in- and out-of-school factors that influence 
student achievement as well as the complex and intersecting historical, economic, social, 
institutional, and political systems that create inequalities in access to teacher quality in 
the first place. A strong equity perspective assumes that teachers and schools alone cannot 
achieve equity; rather, it requires educators working with policymakers and others in larger 
social movements to challenge the intersecting systems of inequality in schools and society 
that produce and reproduce inequity. Working from a strong equity perspective also includes 
focusing directly on creating the conditions for high-quality teaching, such as supports for 
teachers and students, stable and supportive leadership, intensive interventions to close 
opportunity gaps for students in the early grades, and well-supported teacher induction 
programs. 

In something of a contrast to the other three initiatives we review in this brief, the edTPA 
defines teacher quality in terms of teachers’ knowledge, skill, and professional judgment, 
including supporting English language learners. However, even the edTPA does not prioritize 
creating the conditions necessary for strong equity. These include preparing and expecting 
teachers to: recognize and build on the knowledge traditions of marginalized groups; 
understand and challenge inequities in the existing structures of schools and schooling; and 
work with others in larger efforts for social justice and social change. 


Recommendations 

Although debate remains, educators and policymakers at multiple points along the political 
spectrum are increasingly recognizing that reforming teacher preparation is an important 
part of larger efforts to improve the schools and enhance students’ learning. Based on our 
critique of claims and evidence related to four major national accountability initiatives, we 
offer the following policy recommendations. 

• Policymakers must acknowledge and address the multiple factors—in addition to
teacher quality—that influence student outcomes, including in particular the impact
of poverty, family and community resources, school organization and support,
and policies that govern housing, health care, jobs, and early childhood services.

• Systems evaluating teacher preparation must produce results that preparation programs
can use to change and improve curricula, practice-based experiences, and
assessments—not results that simply grade programs without information about
why or how particular results occurred or what might improve them.

• Systems evaluating teacher preparation programs must be built on policy mechanisms
that have documented capacity to produce usable information for local and
larger program improvement within a complex policy and political climate.

• There should be a conceptual shift away from teacher education accountability that
is primarily bureaucratic or market-based and toward teacher education responsibility 1
that is primarily professional and that acknowledges the shared responsibility
of teacher education programs, schools, and policymakers to prepare and
support teachers.

• Evaluations of teacher preparation programs should: 

o Reflect alternative forms of accountability that shift the focus from externally
generated single-measure tests to multi-pronged internal assessments of
teacher performance and student learning. 2

o Avoid “placing too much weight” on value-added assessments of program
graduates’ and programs’ effectiveness. Evaluations of preparation programs
should not be based solely or primarily on students’ test scores. This is consistent
with recommendations in the National Academy of Education report on
teacher preparation evaluation. 3

o Consider teacher educators’ performance (defined as knowledge, practice,
commitments, and professional judgment as they play out in the construction
and operation of programs), teacher candidates’ performance (defined as
knowledge, practice, commitments, and professional judgment as they play
out in classrooms and schools), and students’ learning (defined as academic
learning, social/emotional learning, moral/ethical development, and preparation
for participation in democratic society).

o Recognize that teacher preparation programs have multiple, often complex, 
goals and purposes, including preparing teachers to challenge inequitable 
school and classroom practices and work as agents for social change. These 
goals, which are consistent with a “strong equity” perspective, should be reflected
in evaluation processes.

Holding Teacher Preparation Accountable:

A Review of Claims and Evidence 

In this policy brief, we focus on one of the major contemporary trends 4 in teacher education:
complex and far-reaching policies intended to enhance teacher quality by “holding
teacher education accountable” for its arrangements and/or its outcomes. We take up four
national initiatives that share this intention but are based on differing incentives and
disincentives assumed to drive change. These include: (1) the U.S. Department of Education’s
(DOE) proposed state and institutional reporting requirements in the Higher Education Act
(HEA); (2) the standards and procedures of the Council for the Accreditation of Educator
Preparation (CAEP); (3) the National Council on Teacher Quality’s (NCTQ) annual/biennial
Teacher Prep Review; and (4) the edTPA, a nationally available uniform teacher performance
assessment developed at Stanford University’s Center for Assessment, Learning, and
Equity (SCALE) with aspects of data storage and management outsourced to Pearson, Inc.

These four initiatives reflect different accountability mechanisms and theories of change,
and they are governed by different institutions and agencies, including governmental offices,
professional associations, and advocacy organizations. Despite differences, each assumes
that teacher education in the U.S. requires reform 5 and that the key to reform is
accountability in the form of public assessment, rating, and ranking of states, institutions,
programs, and/or teacher candidates. Two of the initiatives, HEA regulations and state-required
use of the edTPA for licensure, involve direct or bureaucratic accountability, 6 which
means that federal or state offices directly determine rewards and punishment. NCTQ’s review
instead involves indirect market accountability, generating information to influence
prospective “consumers” of preparation programs and/or to promote alterations in the policies
and practices of institutions seeking higher ranking. CAEP accreditation and some institutional
uses of the edTPA are based on what we call indirect professional accountability,
which involves self-policing and self-governance.

Because these initiatives share the assumption that accountability is the central mechanism 
for reforming teacher preparation and thus boosting teacher quality, we treat them together 
in this policy brief. We take up the initiatives one at a time, addressing the same two questions
for each: What claims do proponents of the initiative make about how it will improve
teacher preparation and thus help solve the teacher quality problem in the U.S.? What evidence
supports these claims? The first question gets at the theory of change behind the initiative
and the assumptions its proponents make about how particular mechanisms actually
operate to create change. Answers to the second question, which make up the bulk of this 
brief, involve the validity of the initiative as a policy instrument—that is, whether or not 
there is evidence that the initiative actually meets its stated aims. 


Federal Teacher Preparation Reporting Regulations 

The Higher Education Act (HEA) governs the administration of federal student aid programs,
and both states and teacher education providers must meet its Title II reporting
requirements to be eligible to distribute TEACH grants to students who are prospective
teachers. At the end of 2014, the Obama administration released proposed new Title II reporting
regulations, which required states to rate and report on each preparation program
annually according to: value-added assessments of graduates’ effectiveness; employment
information; consumer satisfaction data; and program accreditation. 7 States were to classify
programs into four performance levels and require low performers to improve or close.
These proposed regulations prompted many jurisdictional and other controversies, 8 and revisions
to the proposed regulations were announced in December 2015. The revisions give
states more authority as well as more flexibility about how to weigh the results of statewide 
standardized achievement tests in annual reports, what other measures of students’ learning 
to include, and how to determine at-risk and low-performing programs. 9 These regulations 
function as a vehicle for direct bureaucratic accountability in the form of state-developed 
and enforced federal reporting requirements. 


Rationale: What Claims Are Made about Reporting Regulations? 

It remains to be seen how the revisions to the HEA reporting regulations will play out in 
practice across the various states, but on paper at least, revised HEA regulations along with 
the newly reauthorized Every Student Succeeds Act (ESSA) will create a lesser level of federal
control of teacher preparation than the initially proposed regulations. However, much
of the rationale behind the revised HEA reporting requirements for teacher preparation 
remains the same. Proponents of new reporting regulations assert that states and teacher 
preparation institutions have generally failed to improve teacher education quality because 
they have gathered “meaningless inputs-based” data 10 that does not identify program quality 
and cannot be used for improvement. 11 Proponents assert that what is needed are transparent,
comprehensive data systems that include “more meaningful indicators of program inputs
and program outcomes, such as the ability of the program’s graduates to produce gains
in student learning.” 12 The rationale here is that data systems linking student achievement
data to teacher data to preparation program data create “a much-needed feedback loop to
facilitate program improvement and provide valuable information” that will put more effective
new teachers in high needs schools. 13

This rationale depends on three claims. The first is that an accountability system required by 
the federal government but developed and implemented by individual states is an effective 
mechanism to control teacher preparation by identifying good and bad programs and thus 
improving teacher quality. The second claim is that making reporting data public will help 
preparation programs “make necessary corrections and continuously improve” and help 
states reform low-performing programs. 14 The third is a market claim that new data systems 
will be used by prospective teachers, employers, and the public, thus motivating programs 
to improve. 15 


Validity of Federal Reporting Regulations as a Policy Instrument: What Is the 
Evidence? 

Recent history provides the best evidence regarding the first claim—that federally required 
accountability systems are effective at improving teacher preparation. As part of HEA Title II 
requirements, since 1998 preparation programs and states have reported to the federal government. 16
These reports provide 17 years of evidence about how well federal requirements
mandating state enforcement function as a mechanism for teacher preparation program
improvement. According to the DOE itself, this approach has failed: few low-performing
programs have ever been identified by states, and according to DOE criteria, new teachers
continue to be unprepared. 17 The revised rules give the states more flexibility than the rules
originally proposed (i.e., states can determine how to designate low-performing programs)
and reflect a major shift in data from the 1998 regulations (i.e., from program inputs to various
measures of student learning outcomes), but the central mechanism—state enforcement
of federal reporting requirements—is similar. History gives us good reason to doubt the first 
claim, that this is an effective policy mechanism to improve teacher preparation quality. 

The second claim is that making systematic data about preparation program outcomes 
public will facilitate program improvement. 18 Because this is similar to one of the claims 
about CAEP accreditation as a policy instrument (below), the following discussion applies 
to our review of both HEA regulations and CAEP accreditation. Both allow multiple measures
of student learning, but require the use of statewide standardized achievement tests
in determining teacher preparation program effectiveness, with HEA requiring state report
cards and CAEP accreditation requiring institutional self-studies. The most relevant
evidence about the usefulness of students’ test scores as a measure of teacher preparation
program effectiveness comes from research on state systems linking program graduates’
value-added teaching scores to their preparation programs as a way to evaluate and improve
programs. This approach has been very controversial, and there are multiple reasons for
caution. 19 However, we drew several conclusions from this research: (1) it is difficult to
disentangle the effect of graduates’ characteristics from the impact of programs; 20 (2) some
states (Tennessee, Louisiana, Florida, Texas, North Carolina, and Ohio) are already using
longitudinal data systems to link programs and student test scores; 21 (3) with value-added
assessments, myriad technical decisions about selection, estimation, and interpretation have
major consequences for conclusions about program quality, 22 and there is much
“methodological messiness;” 23 and (4) in some states and/or labor markets, the results of
value-added assessments have identified programs whose graduates consistently outperform
or underperform other teachers—but the differences in impact are small, and in some studies,
nonexistent. 24 For these and other reasons, many analysts caution that value-added
evaluations should not be given much weight in policy decisions about teacher preparation. 25

This list of conclusions is notable for what is absent: evidence that value-added assessments 
of teacher preparation programs provide a “feedback loop” for improvement. 26 In Tennessee,
yearly report cards show that most institutions are consistent in ratings from year to
year, and changes cannot be attributed to institutions’ use of the ratings. 27 The developers of 
Louisiana’s value-added system point this out explicitly, stating that results “do not answer 
why a particular result occurred or what might be done to improve on it; rather, all it does 
is provide feedback on performance.” 28 In fact we could locate no studies that systematically 
investigated whether or how programs actually used value-added and other outcomes data 
for improvement. Even some strong advocates concede that the effects on program improvement
of outcomes-based state program reviews and/or national accreditation have not been
empirically demonstrated. 29 

The third claim behind the proposed federal regulations is that new public accountability 
systems will be used by those seeking information about preparation quality, thus motivating
programs to improve through market forces. There is little evidence along these lines.
However, in some states where teacher tests were implemented in the late 1990s coincident 
with then-new HEA reporting regulations, institutions were accused of “gaming the system” 
by requiring teacher tests as an admission—but not an exit—requirement, thus producing 
(meaningless) 100% pass rates for their programs. 30 This kind of response casts doubt on 
the market claim. 

The validity of federal reporting regulations as an accountability policy depends on how well 
it fulfills its three claims. We argue that there is good reason to doubt that state development
and enforcement of a federally required reporting system is an effective mechanism
for improvement. Further, although there is a growing body of work about the development 
of systems for value-added assessments of teacher preparation programs, there is almost 
no evidence that these provide actionable evidence or serve as a useful feedback loop for 
improvement. Finally, there is little evidence that the market impact of public data about 
preparation leads to meaningful reform. 31 


CAEP Accreditation 

The Council for the Accreditation of Educator Preparation (CAEP) was founded in 2013 as a 
merger between the National Council for the Accreditation of Teacher Education (NCATE) 
and the Teacher Education Accreditation Council (TEAC), which created a single national 
educator preparation accreditor 32 that provides for indirect professional accountability. Although
state approval of teacher preparation programs is required in all states, 18 states currently
have partnerships with CAEP. 33 To meet CAEP standards, programs are accountable for the 
systematic collection, management, and synthesis of reliable and valid evidence about: the 
quality of candidates; effectiveness of completers and programs based on value-added or 
other measures of teaching effectiveness as well as surveys of employers and completers; 
program outcomes, measured by graduation, licensure, and hiring rates; and, student loan 
default rates. 34 This array of evidence is similar to the data that new HEA regulations require 
states to develop. 


Rationale: What Claims Are Made about CAEP Accreditation? 

CAEP’s advocates assume that the public has lost confidence in preparation programs because
they have not produced teachers who close the achievement gap and teach all students
to world-class standards. 35 For CAEP advocates, the presumed cause of this problem is the 
teacher education profession itself and its failure to make decisions on the basis of evidence 
about graduates’ and programs’ impact. 36 Working from the assumption that the profession 
does and should have jurisdiction for regulating preparation, CAEP’s goal is to “raise the 
bar” through tougher requirements 37 and become the “gold standard” in higher education 
accreditation. 38 CAEP’s theory of change is that the continuous use by the profession of data 
systems featuring “revolutionary” approaches to assessing teachers’ and programs’ impact 39 
will promote program accountability and improvement. 40 

The rationale for CAEP accreditation as a policy instrument relies on three claims. The first 
is that a national accreditation system developed and managed by the profession 41 is an 
effective mechanism for raising standards and thus improving the quality of preparation, 
defined primarily as graduates’ impact on students’ learning. 42 The second closely related
claim is that in the process of meeting standards for accreditation, programs will engage 
in “continuous improvement and innovation” 43 based on reliable and valid evidence about 
outcomes; this process will enhance teacher education and teaching quality. The third claim 
is that an accreditor-created massive database containing systematically collected performance
data will provide usable consumer information, thus restoring policymakers’ and the
public’s trust in the teacher education profession. 44 


Validity of CAEP Accreditation as a Policy Instrument: What Is the Evidence? 

Given that no institutions have yet completed the full CAEP accreditation process, there is 
no direct evidence about the efficacy of CAEP national profession-managed accreditation as 
a mechanism for raising standards and improving preparation programs. Although studies 
about the impact of national accreditation by CAEP’s forerunner, NCATE, are related, there 
is little evidence overall in this area. 45 The results are mixed regarding the relationship between
teacher candidates’ completion of an NCATE-approved program and their scores on
licensure tests, 46 and there is virtually no evidence about whether completion of an NCATE 
program predicts teachers’ practices or career trajectories. 47 More importantly, however, 
even though NCATE and CAEP involve very similar policy mechanisms—institutions seeking
national professional accreditation conduct an institutional self-study according to
accreditor-established and profession-vetted standards followed by peer review—their standards
and evidence requirements are dramatically different. CAEP drastically shifted the
evidence requirements for accreditation—from NCATE’s inputs, processes, and assessments
of candidates’ knowledge and skills to CAEP’s outcomes, performance, and consumer satisfaction.
CAEP’s outcomes approach essentially eliminates the “middle man” in the evidence
game by ignoring a host of relevant variables that intervene between teachers’ preparation 
and the achievement of their eventual students. Because programs must now show direct 
evidence of their graduates’ effectiveness, employment patterns, and consumer satisfaction 
to earn CAEP accreditation, there is not much need for external examinations of correlations 
between CAEP accreditation status and program graduates’ licensure test scores and/or accreditation
status and graduates’ performance.

A major question here is whether measures of outcomes, especially program graduates’ impact
on students’ achievement, are valid assessments of teacher preparation program quality
in the first place, an issue that is enormously controversial, 48 prompting many calls for
caution from measurement experts and professional organizations. 49 The National Academy
of Education’s analysis of teacher preparation evaluation concluded that CAEP legitimacy
hinges on: whether preparation programs can provide evidence of graduates’ impact on
students’ achievement, how other measures of teacher effectiveness can be integrated, and 
whether differences in the features of preparation can be linked to graduates’ classroom 
effectiveness. 50 There have been very few studies along these lines. 51 There have also been 
multiple concerns about CAEP accreditation as an accountability mechanism, including lack 
of confidence within the collegiate teacher education community that CAEP has the capacity 
to manage the reformation of a field fraught with competing policy and political agendas. 52 
Further, questions have arisen about how programs can demonstrate impact on achievement
if they are not in states with already existing value-added assessment systems, 53 and
there are mounting reservations about what these approaches leave out of assessment, such 
as diversity and social justice goals. 54

CAEP’s second claim is that the accreditation process will promote “cycle[s] of evaluation
and continuous improvement” 55 that are “sustained by quality assurance systems,” 56 centering
on new teachers’ effectiveness, employment, and consumer satisfaction. As we noted
above, this claim is very similar to the second claim in the rationale for the new HEA
rules. As we showed in that discussion, there are studies that describe the development
of value-added systems in certain states, but they offer almost no evidence about how
these systems actually provide for and/or motivate continuous improvement.

CAEP’s third claim is that the massive database that CAEP plans to accumulate will serve 
as a clearing house for consumer information about preparation, which will restore trust in 
the teacher education profession. As we noted above, there is no evidence along these lines. 

The validity of CAEP accreditation as a profession-managed accountability policy that will 
improve teacher education hinges on the evidence for its three claims. Our conclusion is that 
there is very good reason to doubt the first claim about the efficacy of a profession-managed 
national accreditation system to raise standards by focusing primarily on students’ achievement
scores. Further, although there is research about the development of state-level value-added
assessments of preparation programs, there is little evidence that these systems
provide actionable feedback loops for program improvement. Finally, as far as we can determine,
there is no relevant evidence that speaks to the impact on policymakers’ or the public’s
trust in teacher education as a result of making accreditation data public. 


NCTQ Teacher Prep Review 

The Teacher Prep Review is an evaluation of collegiate and alternative teacher preparation 
programs in the U.S., 57 conducted by the National Council on Teacher Quality (NCTQ), a 
private advocacy group. 58 The Teacher Prep Review ranks preparation programs based on 
NCTQ-developed input criteria using publicly available and solicited information, including 
syllabi and student teaching guidelines. 59 NCTQ’s rankings are published in US News and 
World Report and disseminated through multiple press releases and NCTQ’s Path to Teach, 
an online consumer guide to preparation options. 60 Although the Teacher Prep Review is 
not technically a policy instrument, it “holds teacher education accountable” through indirect
market accountability in ways similar to some of the other initiatives we review, and it
has become a powerful influence on policy related to teacher education. 61 Thus we ask the 
same questions about the Teacher Prep Review that we ask about the other accountability 
initiatives reviewed here. 


Rationale: What Claims Are Made about the Teacher Prep Review?

NCTQ asserts that the cause of educational decline in the U.S. is teacher education programs 
that are chaotic, out of sync with policy and public demands, and incapable of preparing 
teachers to perform in the classroom. 62 The rationale for the Teacher Prep Review depends 
on a market claim and an effectiveness claim. According to NCTQ, the only way to enforce 
standards in the vast and uneven field of teacher education is by “fully engag[ing] the unparalleled
power of the marketplace” by shining a “harsh spotlight on programs [which] is
highly motivating to them.” 63 NCTQ’s theory of change is that their creation of “the largest
database on teacher preparation ever assembled... [will] set in place market forces that will
spur underachieving programs to recognize their shortcomings and adopt methods used by
higher scorers.” 64 

NCTQ’s effectiveness claim is that teacher preparation programs that are highly rated by 
NCTQ produce teachers who are more effective than other teachers. This is a fundamental 
premise of the Teacher Prep Review: “The National Council on Teacher Quality (NCTQ) 
has long been an advocate for the idea that ‘effective’ teaching must be rooted in academic
results for students. Whatever else they accomplish in the classroom, effective teachers
must improve student achievement.” 65 Although the effectiveness claim is not emphasized 
in NCTQ’s promotional materials with the same force or frequency as its market claim, the 
former is in fact the sine qua non of the entire rationale behind the viability of the Teacher 
Prep Review as an accountability mechanism that will drastically improve teacher education 
quality. 


Validity of Teacher Prep Review as an Accountability Instrument: What Is the 
Evidence? 

There is no existing empirical research about the impact of the Teacher Prep Review on the 
market. 66 The most relevant evidence comes from studies about how prospective students 
and institutions respond to US News and World Report’s annual rankings of the nation’s 
colleges and universities as undergraduate institutions. 67 The findings of several key studies 68
lead to the conclusion that USNWR undergraduate rankings do indeed have some impact
on the application/admissions decisions of potential students as well as the admissions
and other behaviors of higher education institutions. However, some analyses raise questions
about whether rankings-prompted institutional changes actually make sense educationally
or are only attempts to game the system by engaging in “sneaky” actions deliberately
intended to influence rankings. 69 

If we simply extrapolated from the research on the impact of USNWR’s rankings on undergraduate
colleges and universities, it would seem reasonable to conclude that some prospective
teachers might be more likely to apply to programs ranked higher in NCTQ’s Teacher
Prep Review. However, the major assumption behind NCTQ’s market claim is that the
rankings will prompt programs and institutions to change in order to improve their future
rankings. Yet the primary reason the USNWR undergraduate rankings have an impact
on institutions is that the institutions themselves—whether public or private, elite or not— 
take them so seriously, in some cases paying excessive attention. 70 This reasoning cannot 
be extrapolated to the NCTQ rankings of teacher preparation programs. In fact, many collegiate
programs and professional organizations put little stock in the NCTQ rankings, with
most private and many elite institutions not participating, 71 and many public institutions
participating only through compulsion. 72 In addition, NCTQ’s review has been publicly critiqued
by many of these institutions through widely disseminated open letters and other statements that
raise major questions about NCTQ’s methods, motives, and procedures. 73 Further, the NCTQ
standards have not been vetted by the profession, 74 and they lack full alignment with the
Berlin Principles, an internationally approved set of criteria for higher education ranking
systems. 75 We conclude that there is good reason to doubt NCTQ’s market claim that the
“harsh spotlight” 76 of the rankings will prompt substantial changes in the policies and practices
of teacher education programs. 

Research that speaks to NCTQ’s effectiveness claim is limited but direct. The most relevant 
evidence is a 2015 study by Henry and Bastian that investigated the association between
NCTQ’s ratings of preparation programs and two measures of teacher performance
(teachers’ value-added scores and their evaluation ratings) for more than 4500 first- or
second-year teachers in North Carolina. 77 Henry and Bastian found that higher NCTQ program 
ratings generally did not predict either higher teacher value-added scores or better teacher 
evaluations. 78 They concluded: “With our data and analyses, we do not find strong relationships
between the performance of teacher preparation program graduates and NCTQ’s
overall program ratings or meeting NCTQ’s standards.” 79 Two smaller studies corroborate
Henry and Bastian’s general conclusions. Fuller noted that NCTQ program ratings and licensure
pass rates among Texas teacher candidates were not correlated 80 and that NCTQ
program ratings did not predict teachers’ value-added scores in the state of Washington. 81
Dudley-Marling found no relationship between the proportion of preparation programs
meeting NCTQ’s criteria for teaching early reading in individual states and the NAEP reading
performance of pupils in those same jurisdictions. 82 

The validity of the Teacher Prep Review as an accountability instrument depends on whether
it fulfills its market and effectiveness claims. Our conclusion is that there is reason to
doubt the market claim and that there is evidence showing that NCTQ program ratings do
not predict the effectiveness of the graduates of those programs. This calls into serious question
the validity of NCTQ’s Teacher Prep Review as an accountability mechanism that will
boost teacher education quality. 


The edTPA 

The edTPA is a nationally available assessment used to evaluate the performance of teacher 
candidates, currently required for teacher licensure in a number of states and widely used in 
programs across the country. 83 Designed to assess what candidates do and how they reflect 
on their work, the edTPA was developed at Stanford University’s Center for Assessment, 
Learning, and Equity (SCALE), 84 partly as a corrective for narrowly focused standardized 
certification exams. 85 Based on the Performance Assessment for California Teachers (PACT) 
and informed by work related to the National Board for Professional Teaching Standards 
(NBPTS), the edTPA requires teacher candidates to submit a portfolio of lessons and reflections,
which is scored by external reviewers through a data storage and evaluation system
managed by Pearson, Inc. 86 Proponents of the edTPA, which is based on indirect professional
accountability, promote its use as a requirement for initial teacher licensure and its use as 
part of a package of evidence for national program accreditation. 87 


Rationale: What Claims Are Made about the edTPA? 

Proponents of the edTPA assume that teacher quality is a major determinant of student 
achievement. 88 They locate the current problem of uneven teacher education quality within 
the profession itself, specifically teacher education’s failure to develop as a legitimate profession
with uniform expectations about what teachers know and can do. 89 The assumption
behind the edTPA is that the way to fix the problem is implementation of widespread statewide
licensure policies that require authentic performance assessments. 90 

The rationale for the edTPA as a policy instrument that will boost teacher education quality 
and ultimately teacher quality involves three closely coupled claims. The first is an authenticity
claim: that the edTPA is a valid and authentic measurement tool that both reflects and
predicts teacher candidates’ success in the classroom 91 and thus ultimately improves learning
for students. 92 The second is the professional claim that implementing the edTPA will 
positively impact the professional learning of teacher candidates and prompt continuous 
program improvement and renewal. 93 The third is a professionalization claim about broad 
impact: widespread implementation of the edTPA will steer improvement from inside the 
profession, 94 and teacher education’s self-regulation will boost the status of the profession 
in the eyes of policymakers and the public. 95 


Validity of the edTPA as a Policy Instrument: What Is the Evidence? 

To assess validity and effectiveness claims about the edTPA, we reviewed evidence regarding 
the edTPA itself, the PACT, and National Board for Professional Teaching Standards certification.
Studies of the PACT 96 and edTPA 97 were conducted internally by the organizations
that developed the instruments (researchers at Stanford University and SCALE,
respectively). These traditional psychometric studies indicated that both the PACT and the 
edTPA were valid and reliable tools for the assessment of individual teacher competence and 
licensure decisions. On the other hand, some analyses based primarily on teacher educators’ 
experiences have asserted that these assessments are not appropriate for all fields, have 
limited conceptions of teaching and learning, and do not acknowledge the impact of school 
constraints, local contexts, and social justice aims. 98 In response, an extensive conceptual 
analysis authored by a member of the edTPA national design team concluded that despite 
claims to the contrary, the edTPA does allow for candidates’ equity and social justice aims. 99 

The predictive validity of the edTPA has not been established. 100 However, there is related 
evidence from the PACT, which is the edTPA’s forerunner. A pilot and follow-up study of the
PACT (with 105 and 14 candidates, respectively) linked teacher candidates’ PACT scores with
value-added assessments and concluded that PACT scores predicted later teaching effectiveness,
as evidenced by statistically significant but small differences in students’ achievement
gains. 101 In addition, studies have concluded that although differences are small, NBPTS-certified
teachers are generally more effective than non-NBPTS teachers. 102 An important
caveat here is that NBPTS results do not necessarily extrapolate to the edTPA, given important
differences in populations and purposes. The edTPA is a required assessment for teacher 
candidates to gain initial state teaching certification, while NBPTS assessments are part of 
optional advanced licensure opportunities for experienced teachers. 

A number of studies are relevant to the edTPA’s second claim that it promotes professional 
learning and prompts positive institutional and program change. Again we draw on both 
PACT and edTPA studies here. Most candidates who participated in pilots of the PACT reported
that it was a source of learning and that their programs helped them prepare for the 
assessment, 103 although some also said the PACT could induce stress and take time away 
from other coursework and fieldwork priorities. 104 Candidates whose programs helped them 
prepare were more likely to report the PACT was a source of learning. 105 In contrast, most 
surveyed teacher candidates who took the edTPA in New York and Washington, the first 
two states where the edTPA was required for licensure, reported that the assessment was 
unfair, unclear, and time consuming and that their programs did not prepare them well; 
teacher candidates in Washington, where there was a gradual rollout of the edTPA, generally 
reported greater understanding and preparation than those in New York where high-stakes 
implementation was sudden. 106 Finally, institutional self-studies of several large-scale institutional
implementations of the PACT or the edTPA concluded that implementation of the 
assessments had a positive institutional and curricular impact, 107 and that even when there 
were faculty concerns about alignment or loss of local control, 108 successful implementation 
depended on distributed leadership and support at all university levels. 

The third claim about the edTPA is that its widespread implementation will enhance public 
perceptions about collegiate teacher education as a self-regulating enterprise and boost the 
status of the profession. Only history can really speak to this claim. However, it is worth noting
that there have been multiple concerns about the edTPA within the teacher education 
community, including concerns about the role of Pearson, Inc., contracted by Stanford to 
manage edTPA data scoring and data storage, as well as concerns about the edTPA’s lack of 
attention to local cultural issues and its potential to undermine teacher educators’ professional
autonomy. 109 Concerns like these could preclude widespread acceptance of the edTPA. 
In addition, given that the dominant U.S. approach to education reform equates teacher 
effectiveness with boosting students’ test scores (and not with professional judgment, which 
is at the heart of the edTPA), it may be that those outside the profession will have no interest 
in the edTPA and thus it will not have the desired effect on professional status as perceived 
by outsiders. 110 

Based on the evidence, our conclusion is that the edTPA is a valid assessment of some valued 
aspects of teaching, although there is no evidence to date that the edTPA itself predicts effectiveness.
Implementation of the edTPA has the potential to prompt professional learning 
for candidates, programs, and institutions under certain conditions: alignment of edTPA 
and program/institutional goals and values, adequate institutional leadership and capacity 
building, and gradual supported implementation. However, as the problematic case of edTPA
implementation in New York State indicates, 111 these conditions may be difficult to meet. 
In addition, although we favor complex evaluations of teacher education, we acknowledge 
they will have major difficulties reversing the “common sense” conclusions of the education 
reform movement that collegiate teacher education is a failed enterprise. 112 


Holding Teacher Preparation Accountable: Thin Evidence, Thin Equity 

Although it is widely assumed that teacher quality is a critical determinant of student and 
school outcomes, 113 the role that teacher preparation plays is less clear. Teacher preparation 
has emerged as an acutely politicized and highly publicized issue in the U.S., and there have 
been fierce debates about whether, how, by whom, and for what purposes teachers should 
be prepared. This means that assessing the impact of major national initiatives designed to 
“hold teacher education accountable” is of great interest to policymakers, educators, and 
the public. Based on our extensive review of evidence and claims for four major initiatives 
intended to hold teacher preparation accountable, we conclude that for the most part, they 
are based on both thin evidence and a thin notion of equity that does not adequately account 
for the complex and longstanding out-of-school factors that produce and reproduce educational
inequality. 


Thin Evidence 


Few would oppose the idea that those engaged in the enterprise of teacher preparation 
should be professionally accountable for their work. Yet our critique of the claims and evidence
regarding four accountability initiatives raises many questions. Across three of the 
four initiatives (HEA regulations, CAEP accreditation, and NCTQ’s reviews), there is thin 
evidence to support the claims proponents make about how the assumed policy mechanisms 
actually operate (or would operate if implemented) to improve the quality of teacher preparation. 114
Rather, the advocates of these initiatives assume a more or less causal relationship
between the implementation of their summative evaluations and the improvement of teacher
preparation quality. This black-box logic is misguided. Summative evaluations
intended to influence policy decisions generally do not provide usable information for program
improvement. 115 The irony here is that while these policies call for teacher preparation 
programs and institutions to make decisions based on evidence, the policies themselves are 
not evidence-based, and there is good reason to question the validity of these initiatives as 
policy instruments that will have a positive impact on teacher education quality. 


The edTPA is something of an exception here. Because it builds on California’s PACT assessment
of teacher candidates, the implementation of PACT serves as a pilot study for the
edTPA, and there have also been multiple field tests of the edTPA itself. Thus, unlike the
policy mechanisms of the other three initiatives reviewed here, the policy mechanism behind
the edTPA, a statewide uniform teacher candidate performance assessment required for initial
teacher licensure, does have an evidentiary basis, although it is very limited as shown above.
Even with the edTPA, however, major difficulties in the rollout of the edTPA in the state of
New York raise many questions about the feasibility of statewide implementation. 116 In addition,
there are multiple concerns within the collegiate teacher preparation community about what
the edTPA leaves out and about the problematic role of Pearson, Inc. These situations suggest
that widespread statewide implementation and widespread professional acceptance of the edTPA
as a uniform assessment, both of which are essential to the edTPA’s capacity as a driver of
change, will be challenging to accomplish. 


In addition to the fact that there is thin evidence about the feasibility of the above policy 
mechanisms to produce change, our review also raises questions about the definitions of 
teacher quality underlying the policies. Even though revisions to the HEA regulations give 
states the option to use multiple measures of students’ achievement, both the HEA regulations
and CAEP procedures hinge on the assumption that teacher preparation program 
quality is primarily a function of teacher quality, defined in terms of graduates’ impact on 
students’ achievement, as measured by standardized tests. Defining student learning (and 
teacher quality) largely in terms of students’ test scores has been so widely challenged and 
critiqued as a narrow, limited, and superficial approach 117 that we do not elaborate further 
here, but we do add our voices to these challenges. In contrast, NCTQ’s teacher preparation 
reviews rely on an input-based definition of teacher education quality that is out of sync with 
both the dominant, though problematic, approach to education reform and with notions 
of teacher quality that are prominent within the collegiate teacher education community. 
Nevertheless, as we show above, the NCTQ reviews assume that teacher effectiveness is ultimately
defined by student test scores and that the inputs the reviews emphasize are related
to test score outcomes. 

The definition of teacher quality underlying the edTPA reflects an alternative to, and a clear 
rejection of, value-added type assessments of teachers and teacher education programs, and 
instead focuses on teachers’ deep knowledge of how children learn, their complex classroom 
skills, and their professional judgment. 118 We see this definition of teacher quality as promising
but partial, and we return to this in our recommendations. 


Thin Equity 

At a general level, all four of the accountability initiatives we reviewed are intended to promote
equity by ensuring that all students have access to good teachers. But our analysis 
reveals that underlying most of these initiatives is what we call “thin equity,” borrowing 
from democratic theorist Benjamin Barber’s 119 now classic distinction between “thin” and 
“strong” democracy. Barber defined “strong” democracy as the participation of all of the 
people in at least some aspects of self-governance at least some of the time. In contrast he 
critiqued “thin” or representative democracy, suggesting that thin democracy was based on 
an individualistic, rights perspective rather than a strong participatory view. 

As noted above, we use the term “thin equity” to refer to teacher education accountability 
policies wherein equity is defined as equality or “sameness.” The assumption is that equity 
can be achieved by assuring that all students have the same access to “high quality” teachers 
(as noted, for HEA, CAEP, and implicitly NCTQ, this primarily means teachers who boost 
students’ test scores or other measures of achievement) but without addressing the larger 
historical and institutional systems of inequality and lack of participation that produced 
inequity in the first place. In contrast, the edTPA is grounded in a notion of teacher quality 
that involves teachers’ knowledge, skill, and professional judgment, including understanding
the language demands of academic tasks, which supports English language learners.
Even with this viewpoint, however, the edTPA does not give precedence to preparing teachers
to understand and challenge inequities in the existing structures of schools and schooling,
to recognize and build on the knowledge traditions of marginalized groups, or to work 
with others as agents of social justice and social change. 

When policies work from a thin equity perspective, the assumption is that school factors, 
especially teachers, are the major sources of educational inequality, even though this is, as 
we stated above, a conclusion that is not based on evidence. This means that access to good 
teachers is assumed to be the solution to inequality. This viewpoint does not adequately 
acknowledge that inequality is rooted in and sustained by much larger, longstanding, and 
systemic societal inequities. In contrast, a “strong equity” perspective acknowledges the 
complex and intersecting historical, economic, social, institutional, and political systems 
that create inequalities in access and lack of access to teacher quality in the first place. This 
perspective assumes that equity cannot be achieved by teachers and schools alone; rather 
it requires educators working with policymakers and others in larger social movements to 
challenge the intersecting systems of inequality in schools and society that produce and reproduce
inequity. 


Recommendations 

Although debate remains, educators and policymakers at multiple points along the political 
spectrum are increasingly recognizing that reforming teacher preparation is an important 
part of larger efforts to improve the schools and enhance students’ learning. Based on our 
critique of claims and evidence related to four major national accountability initiatives, we 
offer the following policy recommendations. 

• Policymakers must acknowledge and address the multiple factors, in addition to
teacher quality, that influence student outcomes, including in particular the impact
of poverty, family and community resources, school organization and support, 
and policies that govern housing, health care, jobs, and early childhood services. 

• Systems evaluating teacher preparation must produce results that preparation programs
can use to change and improve curricula, practice-based experiences, and 
assessments—not results that simply grade programs without information about 
why or how particular results occurred or what might improve them. 

• Systems evaluating teacher preparation programs must be built on policy mechanisms
that have documented capacity to produce usable information for local and 
larger program improvement within a complex policy and political climate. 

• There should be a conceptual shift away from teacher education accountability that 
is primarily bureaucratic or market-based and toward teacher education responsibility 120
that is primarily professional and that acknowledges the shared responsibility
of teacher education programs, schools, and policymakers to prepare and 
support teachers. 

• Evaluations of teacher preparation programs should: 

o Reflect alternative forms of accountability that shift the focus from externally
generated single-measure tests to multi-pronged internal assessments of 
teacher performance and student learning. 121 

o Not be based solely or primarily on students’ test scores; as the National 
Academy of Education report on teacher preparation evaluation recommends, 
state-level decision makers and K-12 administrators should avoid “placing too 
much weight” on value-added assessments of program graduates’ and programs’
effectiveness. 122 

o Consider teacher educators’ performance (defined as knowledge, practice, 
commitments, and professional judgment as they play out in the construction
and operation of programs), teacher candidates’ performance (defined as
knowledge, practice, commitments, and professional judgment as they play
out in classrooms and schools), and students’ learning (defined as academic
learning, social/emotional learning, moral/ethical development, and preparation
for participation in democratic society). 

o Recognize that teacher preparation programs have multiple, often complex, 
goals and purposes, including preparing teachers to challenge inequitable 
school and classroom practices and work as agents for social change. These 
goals, which are consistent with a “strong equity” perspective, should be reflected
in evaluation processes. 